Can Common Crawl reliably track persistent identifier (PID) use over time?
Authors
Abstract
We report here on the results of two studies using two and four monthly web crawls respectively from the Common Crawl (CC) initiative between 2014 and 2017, whose initial goal was to provide empirical evidence for the changing patterns of use of so-called persistent identifiers. This paper focusses on the tooling needed for dealing with CC data, and the problems we found with it. The first study is based on over 10^12 URIs from over 5x10^9 pages crawled in April 2014 and April 2017; the second study adds a further 3x10^9 pages from the April 2015 and April 2016 crawls. We conclude with suggestions on specific actions needed to enable studies based on CC to give reliable longitudinal information.
Journal:
- CoRR
Volume: abs/1802.01424
Publication date: 2018